Synthesising attitudes with global rhythmic and intonation contours

Authors

  • Yann Morlec
  • Gérard Bailly
  • Véronique Aubergé
Abstract

We present a trainable generative model of French prosody. We focus on the sentence level and design SNNs able to generate both rhythmic and intonation contours for diverse attitudes. First results of a perceptual test show that listeners are able to retrieve the correct definition of each attitude by listening to synthetic PSOLA stimuli.

1. THEORETICAL FRAMEWORK

In our theoretical framework, prosody can be described as the superposition of independent multiparametric prosodic contours belonging to diverse linguistic levels [1]: sentence, clause, group, subgroup... These prototypical movements are progressively stored in a prosodic lexicon and dynamically used by the speaker to mark (segmentation...), highlight (salience) and enrich (attitudes...) the linguistic structure of the discourse. In our approach, each syllable participates in the encoding of each linguistic level, and higher levels can use any melodic or rhythmic variation to express linguistic representations.

This theoretical framework contrasts with the most popular models described in the literature:

– Tonal approaches, such as those promoted by prosodic phonology, where intonation is described with local events such as tones and breaks, whose function is described [9] by higher phonological constructs such as the intonational phrase, the phonological phrase or the word.
– Superpositional models based only on physical or geometric [7] parameters such as cut-off frequencies or declination lines.
– Data-fitting or purely lexicon-based approaches, where synthesis is reduced to adequate and accurate labelling [4].

The model proposed here makes strong assumptions about the way linguistic and paralinguistic attributes are encoded in prosody. The main challenge of our work is to demonstrate that the parameters of this model can be learned in order to adequately and accurately predict a multiparametric prosodic continuum.

[Figure: network diagram. Recoverable labels: input layer; 3 F0 values (quarter-tones) per IPCG; IPCG ratio; number of syllables.]
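
The superposition idea described in the theoretical framework can be illustrated with a minimal sketch. It is not the authors' model: the paper generates contours with trained networks, whereas the two contour functions below (`sentence_contour`, `group_contour`), their shapes and their quarter-tone values are hypothetical placeholders chosen only to show how every syllable receives a contribution from every linguistic level.

```python
from typing import Callable, List, Tuple

# A contour maps a syllable's position i inside a unit of n syllables
# to an F0 offset in quarter-tones. All shapes here are invented.
Contour = Callable[[int, int], float]
Span = Tuple[int, int]  # (start, end) syllable indices, end exclusive


def sentence_contour(i: int, n: int) -> float:
    """Hypothetical sentence-level shape: linear fall of 8 quarter-tones."""
    return 0.0 if n <= 1 else -8.0 * i / (n - 1)


def group_contour(i: int, n: int) -> float:
    """Hypothetical group-level shape: 4 quarter-tone rise on the last syllable."""
    return 4.0 if i == n - 1 else 0.0


def superpose(n_syllables: int,
              levels: List[Tuple[Contour, List[Span]]]) -> List[float]:
    """Add up the contribution of every level for every syllable."""
    total = [0.0] * n_syllables
    for contour, spans in levels:
        for start, end in spans:
            length = end - start
            for i in range(length):
                total[start + i] += contour(i, length)
    return total


# One sentence of 5 syllables split into two groups (2 + 3 syllables).
f0 = superpose(5, [
    (sentence_contour, [(0, 5)]),
    (group_contour, [(0, 2), (2, 5)]),
])
print(f0)
```

In this toy setting the group-level rises sit on top of the sentence-level fall, so each syllable's value encodes both levels at once. A rhythmic parameter could be superposed in the same way by adding a second list of per-syllable values.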

Similar resources

A hierarchical intonation model for synthesising F0 contours in Galician language

In this contribution we propose a hierarchical intonation model for synthesising F0 contours, with application to text-to-speech synthesis in the Galician language. This model makes use of the implicit knowledge that resides in a database of natural F0 contours obtained from a read corpus. The novelty of this method lies in the way the F0 contour is generated. First, no phonological description in t...

Generating the Prosody of Attitudes

This paper presents a superpositional model of prosody where linguistic structures are directly encoded into the prosodic parameters via global melodic and rhythmic contours. This model is applied here to the generation of prosody specific to six common attitudes in French. Perceptual tests using high-quality TD-PSOLA re-synthesis show that predicted contours yield the same identification score...

CoPaSul Manual - Contour-based parametric and superpositional intonation stylization

The purposes of the CoPaSul toolkit are (1) automatic prosodic annotation and (2) prosodic feature extraction from syllable to utterance level. CoPaSul stands for contour-based, parametric, superpositional intonation stylization. In this framework intonation is represented as a superposition of global and local contours that are described parametrically in terms of polynomial coefficients. On t...

Vocative intonation preferences are sensitive to politeness factors.

Although intonation has been traditionally associated with the expression of attitudes and intentions on the part of the speaker, little is known about whether sociopragmatic factors, such as power or social distance, or situational ones, like physical distance or insistence, can constrain the use and felicity of pitch contours. This article investigates the felicity conditions underlying the c...

Connecting stimulus-driven attention to the properties of infant-directed speech - Is exaggerated intonation also more surprising?

The exaggerated intonation and special rhythmic properties of infant-directed speech (IDS) have been hypothesized to attract infants' attention to the speech stream. However, studies investigating IDS in the context of models of attention are few. A number of such models suggest that surprising or novel perceptual inputs attract attention, where novelty can be operationalized as the statistical...

Publication date: 1997